Abstract

This document discusses an approach, and its rudimentary
realization, towards the automatic classification of PPs, a
topic that has not received as much attention in NLP as NPs
and VPs. The approach is based on rule-based heuristics and
is organized into several levels of our research. We
consider 7 semantic categories of PPs, which we are able to
classify from an annotated corpus.



1 Introduction 

Historically, prepositions have not enjoyed the attention
given to nouns and verbs, being until recently relegated to
the status of "an annoying little surface peculiarity"
[Jac73]. However, linguistics tells us that different
syntactic categories contain distinct semantic
characteristics which are often exclusive to members of that
category [LP91]. This raises two distinct yet equally
important questions: How are such categories found
syntactically, and how are their characteristics expressed
semantically?

The famed linguist Sir Randolph Quirk states that "a
preposition expresses a relation between two entities, one
being that represented by the prepositional complement"
[Qui85].

In this paper we describe the development of a
heuristics-based system for locating prepositional phrases
and classifying them semantically.



2 Theoretical Foundations 

Through a survey of the literature, it is clear that the study 
of prepositions is becoming more intricate, and is seg- 
mented into a few focussed areas. Our work is therefore 
motivated towards the construction of a sequential system 
of increasingly complex levels, each of which is represen- 
tative of one of these focussed areas. The system is orga- 
nized in such a way that the product of one level becomes 
a dependency of the next as outlined by the following se- 
quence: 

Level 0: From Part-of-Speech annotated text, minimal 
prepositional phrases are found at the syntac- 
tic level according to a context-free grammar 
(CFG), and not categorized. 

Level 1: Minimal prepositional phrases are aug- 
mented with a set of labels indicating classes 
of semantic roles, by means of rule-based 
heuristics. 

Level 2: The proper attachment of the prepositional
phrase is attempted with shallow heuristics
based on the results of Levels 0 and 1, in case
of ambiguity.

Level 3: Semantic characteristics of the PP and its
co-predicate phrases are analyzed in order
to perform attachment 'intelligently', and to
discover more thorough semantic relations.

Each level is described in more detail in Sections 2.1
through 2.4.

2.1 Discovery of Minimal PPs: Syntactic Analysis

The first stage is to locate prepositional phrases within a
text grammatically, regardless of semantic function. For
this we turn to linguistics, which studies PPs by their
production and by their expansion, as outlined below.

2.1.1 Production of PPs

Prepositional phrases generally attach to noun phrases and
verb phrases, usually acting adjectivally for the former and
adverbially for the latter [Hud03], as in the example

(S
  (NP
    (DET the) (HEAD boy)
    (PP
      in
      (NP
        (DET the) (HEAD shop))))
  (VP
    is waiting
    (PP
      at
      (NP
        (DET the) (HEAD corner)))))

At first glance, we should be able to augment grammatical
noun and verb phrases by simply adding rules of the form
NP <= NP PP and VP <= VP PP [JM00]. However, this can
lead to overgeneration, especially in cases where the part
of speech of a potential preposition is ambiguous. For
example, in "Turn off/RB the light", "off" is used as an
adverb, but if we are not careful with our expansion rules
for VP, we could easily generate
"(VP turn (PP off/IN (NP the light)))" if the tagging is
done incorrectly. Some such tagging errors cannot be avoided
within the scope of this project, but careful augmentation
of existing rules, such as NP <= DET MOD HEAD PP, can
result in a more fine-tuned grammar, as will be shown in our
results (see Section 3.2). NP and VP expansions producing
PPs are given in Appendix B.

2.1.2 Expansion of PPs

Though there is some debate as to the semantic function
played by the prepositional phrase [FS03], the syntactic
nature of a prepositional phrase is relatively universally
accepted. It is very safe to define the prepositional phrase
specifically as having a preposition as its head, followed
by a noun phrase or an entity which is always the direct
object [Hud03].

A full listing of rules for PP expansions is given in
Appendices A and B.

2.1.3 Non-minimal PPs 

We will often encounter sequences of contiguous
prepositional phrases, as in "I saw the fool on/IN the hill
with/IN the telescope". In such circumstances, we are
principally concerned with PP-attachment, which is a
significant grammatical challenge but which also involves
semantic analysis; hence its discussion is delayed until
Section 2.3. Suffice it to say, we will with some regularity
encounter such a sequence of PPs where each PP modifies
the same predicative, and hence we add the production
rule PP <= PP PP to deal with such a circumstance.

2.1.4 The Mechanism of Discovery 

The mechanism for discovering prepositional phrases is
an implementation of the Earley parser in Scheme. This
provides a simple interface to grammatical tree expansions
based purely on syntactic context-free rules. However, the
algorithm is devoid of any stochastic or world knowledge,
and hence cannot be used for more complex grammatical or
semantic analysis. The instantiated grammar is a partial
parser (a PP-chunker) which searches exclusively for
prepositional phrases.
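
To make the rule set concrete, the following is a minimal sketch in Python of a greedy PP-chunker over part-of-speech tagged tokens. It is not the Earley parser described above, and it is not the Scheme implementation; it only illustrates the productions PP <= IN NP, PP <= IN IN NP, PP <= TO NP and PP <= PP PP. The simplified NP pattern, the tag set, and all function names are assumptions made for illustration.

```python
# Simplified greedy PP chunker (NOT an Earley parser); illustrative only.
# Assumed noun-like tags that may head a simple NP.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS", "PRP"}


def match_np(tagged, i):
    """Return the index just past a simple NP starting at i, or None.

    Assumed NP shape: optional DT, any number of JJ, one or more nouns.
    """
    j = i
    if j < len(tagged) and tagged[j][1] == "DT":
        j += 1
    while j < len(tagged) and tagged[j][1] == "JJ":
        j += 1
    first_noun = j
    while j < len(tagged) and tagged[j][1] in NOUN_TAGS:
        j += 1
    return j if j > first_noun else None


def match_pp(tagged, i):
    """Return the index just past a minimal PP starting at i, or None.

    Covers PP <= IN NP, PP <= IN IN NP and PP <= TO NP.
    """
    if i >= len(tagged) or tagged[i][1] not in ("IN", "TO"):
        return None
    j = i + 1
    # PP <= IN IN NP, e.g. "because of the rain"
    if tagged[i][1] == "IN" and j < len(tagged) and tagged[j][1] == "IN":
        j += 1
    return match_np(tagged, j)


def chunk_pps(tagged):
    """Return (start, end) spans of PPs; contiguous PPs are merged."""
    spans = []
    i = 0
    while i < len(tagged):
        end = match_pp(tagged, i)
        if end is None:
            i += 1
            continue
        if spans and spans[-1][1] == i:
            spans[-1] = (spans[-1][0], end)  # PP <= PP PP
        else:
            spans.append((i, end))
        i = end
    return spans


def parse_tagged(text):
    """Split 'word/TAG word/TAG ...' into (word, tag) pairs."""
    return [tuple(tok.rsplit("/", 1)) for tok in text.split()]


sentence = parse_tagged(
    "I/PRP saw/VBD the/DT fool/NN on/IN the/DT hill/NN "
    "with/IN the/DT telescope/NN"
)
spans = chunk_pps(sentence)
phrases = [" ".join(word for word, _ in sentence[s:e]) for s, e in spans]
print(phrases)
```

In this sketch the two contiguous PPs are merged into one span, mirroring the decision to leave attachment aside at this level, and a correctly tagged "off/RB" in "Turn off/RB the light" yields no PP at all.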

2.2 Semantic Role Annotations & Categorization

Prepositions convey significant semantic relations in text,
and provide "the principal means of conveying semantic
roles for the supporting entities referred to in a
predication" [OW02]. They are, however, highly ambiguous,
with closely related word senses and a relatively high
degree of internal polysemy. Furthermore, definitions of
how semantic roles are conveyed, and of how supporting
entities relate, are varied and often imprecise. At present,
it is not feasible for automatic systems to attain a degree
of semantic granularity comparable to that of collegiate
dictionaries, but we can at least categorize different uses
of the prepositional phrase by broad semantic
characteristics.

2.2.1 Semantic Role Annotations 

The Penn Treebank describes the prepositional phrase as
having semantic-role subcategorizations defined by
case-style relations, the 7 most frequent of which are shown
with their associated frequencies of occurrence in Table 1.
Although by no means a thorough description of the
semantic relations within a text, such subcategorizations
give a suitable indication of the manner in which a
prepositional phrase is used, and therefore of how it
modifies the semantics of the phrase to which it attaches.



Tag      Freq.   Description
PP-LOC   17220   locative
PP-TMP   10572   temporal
PP-DIR    5453   direction
PP-MNR    1811   manner
PP-PRP    1096   purpose
PP-EXT     280   extent
PP-BNF      44   beneficiary

Table 1: Subcategorization of augmented PPs, ordered by
frequency of occurrence in [Bie95].

These semantic relations can be attached to any verb
complement, but more frequently occur with noun phrases
and their clauses. The University of Pennsylvania
provides online annotated texts from SIGLEX'99 [sei03]
[tru03] with parsed examples showing each of these
categories:

PP-LOC: "...a federal grand jury (PP-LOC in (NP
Newark))..." [sei03]

PP-TMP: "The people who suffer (PP-TMP in (NP the
short term))..." [sei03]

PP-DIR: "Citicorp had discussed lowering the offer
(PP-DIR to (NP $250 a share))" [tru03]

PP-MNR: "He could be left (PP-MNR without (NP
top-flight legal representation))..." [sei03]

PP-PRP: "...prosecutors told Mr. Antar's lawyers
that (PP-PRP because of (NP the recent
Supreme Court rulings))..." [sei03]

PP-EXT: "AMR declined (PP-EXT by (NP
$22,125))..." [tru03]

PP-BNF: "I baked a cake (PP-BNF for (NP Doug))"
[Bie95]

It is with this simple framework of 7 semantic annota- 
tions that we choose to develop our system. Although our 
heuristics will be designed to suit these categories, alter- 
native classifications of prepositional phrases can be used 
to help obtain insight towards this process. 

2.2.2 Alternative classifications 

Alternative classifications for prepositional phrases have
been discussed within the domain of semantic encoding in
lexica [EAG96], where labels are assigned to prepositions
but describe whole phrases according to whether they
apply to PPs modifying predicative heads (verbs and
predicative nouns) or to PPs modifying nonpredicative heads
(nonpredicative nouns). This distinction is justified
semantically and syntactically (see footnote 1), and most
labels associated with predicative-head modifiers coincide
directly with the semantic role annotations in
Section 2.2.1.

The subtlety of such a distinction is shown for
modifiers of nonpredicative heads, all of which describe
quality modifications. Consider, for instance, the
difference between a place-position (locative) modifier in
"the report is ON THE TABLE" and the quality-place
modifiers in "oranges FROM SPAIN" (origin) and "a road
THROUGH EGYPT" (path). Even subtle semantic differences
in positional modifiers can be noted depending
on the predication of the head. Although such a
distinction is not replicated in our system, the motivation
behind the analysis of the semantic qualities of the head is
applied theoretically to our heuristics.

1 From a syntactic point of view, the choice of the prepositional label
modifying nonpredicative heads is much more restricted than for
predicative heads, although more predominantly in the Romance languages
than in the Germanic family. Generally, nonpredicative prepositional
modifiers 'behave' more like adjectives, whereas predicative
prepositional modifiers 'behave' more like adverbs. [sei03]

2.2.3 Temporal PPs (TPPs)

Of those semantic categorizations common across multiple
paradigms, temporal prepositional phrases have received
special attention. Ian Pratt et al. [PF97] show
that temporal prepositional phrases with NP complements,
with PP complements (validating our non-minimal grammar
decision!), and even with sentential complements involving
quantification restrictions and structural ambiguities, can
be grouped into a unified theory of generalized temporal
quantifiers, which serve as meanings for temporal PPs
specifically, but also for NPs and sentences.
Their theory differs from the usual categorization theory
in that its semantics relies heavily on the λ-calculus and
warp functions, and hence it is used here only as a
theoretical landmark.

It should be noted that, in response to Pratt et al.,
Nissim Francez et al. (see footnote 2) argue that the
unorthodox semantic operations and lack of syntactic cues in
that paper make such an approach infeasible for practical
purposes, and that a simpler syntactic framework (indexical
prepositional phrases) would be better employed, which is an
approach somewhat closer to our implementation.

2.2.4 Categorization Heuristics 

The heuristic approach to the problem of semantic
categorization is easily implemented in a transformational
Perl script that accepts simple PP partial parses and
transforms the tag PP to PP-XXX (where XXX is one of the
annotations of Section 2.2.1). Syntactic cues from the
partial parse include the following:

1. The lexical entry of the head preposition 

2. The head of the component noun phrase 

It is tempting to produce heuristics based exclusively on
the preposition alone (1), since intuitively the
preposition should indicate the manner in which a PP is
being used. For instance, in the examples "put the letter
IN the mailbox", "the dog is IN the space capsule" and "The
farmer IN the dell", the preposition "in" is always used
as a locative preposition. A moment's reflection will
reveal that this approach is extremely naive, because most
prepositions can be used for multiple functions, as in the
following examples:

2 Who, ironically, contributed to the paper under criticism. 



• ex. "The transient sleeps WITHIN the cardboard 
box" (LOC) vs. "I'll be back WITHIN three weeks" 
(TMP) 

• ex. "I kicked the ball TOWARDS the net" (DIR) vs. 
"My politics tend TOWARDS the left" (MNR) 

• ex. "I'll see you IN a couple of weeks" (TMP) 

In the vast majority of cases, our heuristics cannot
consider the preposition alone, because prepositions can
often be used under different semantic relations. However,
some prepositions, such as "when" or "because", are so
often associated with a single semantic function (temporal
and purpose, respectively) that we can make minimal
use of the preposition's lexical entry, though not
exclusively. We implement a two-layer system of subsuming
heuristics, the first of which makes a "guess" at the class
using lexical knowledge that associates words with semantic
classes. For instance,

TMP: when, until, in, during, after, before, while, 
under, over, then, since, around, at, through- 
out 

This will of course lead to instances where certain am- 
biguous prepositions such as "for" are always guessed to 
be of the most frequent category - but this is excusable 
because of the second heuristic layer. 
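
The first layer can be sketched as a simple lexicon lookup. The Python fragment below is a minimal illustration, not the Perl script itself: only the TMP word list is taken from the text, while the other word lists, the function names, and the choice to break ties by the frequency order of Table 1 are assumptions.

```python
# Categories in decreasing Penn Treebank frequency (Table 1).
PRIORITY = ["LOC", "TMP", "DIR", "MNR", "PRP", "EXT", "BNF"]

LEXICON = {
    # TMP list as given in the text.
    "TMP": {"when", "until", "in", "during", "after", "before", "while",
            "under", "over", "then", "since", "around", "at", "throughout"},
    # The remaining lists are hypothetical placeholders for illustration.
    "LOC": {"in", "at", "on", "under", "over", "within", "around"},
    "DIR": {"to", "towards", "into", "from"},
    "PRP": {"because", "for"},
    "BNF": {"for"},
}


def first_layer_guess(preposition):
    """Return the most frequent category listing the preposition, or None."""
    word = preposition.lower()
    for category in PRIORITY:
        if word in LEXICON.get(category, set()):
            return category
    return None


def retag(preposition):
    """Turn the bare tag PP into PP-XXX when a guess is available."""
    guess = first_layer_guess(preposition)
    return "PP-" + guess if guess else "PP"


for prep in ("in", "for", "when", "of"):
    print(prep, retag(prep))
```

Under these assumed lists, an ambiguous preposition such as "for" is always guessed as the more frequent of its candidate categories, which is exactly the kind of error the second layer is meant to correct.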

The second heuristic layer subsumes (overrides decisions
made in) the first and forms the most significant part
of the system. It makes use of semantic knowledge in
WordNet about the head of the PP's component noun
phrase in order to guide semantic classification.
Specifically, depending on the NP expansion rule that
expands the direct object of the preposition, the relevant
head of the noun phrase is extracted. For instance, for
pronouns and proper NPs, the whole NP is considered;
otherwise, only the grammatical head of the noun phrase
(which has been annotated automatically) is considered.

Given the relevant head of the noun phrase, hypernyms
for that word are looked up in WordNet using the
command 'wn $noun -hypen'. This will often result in
multiple possible hypernym expansions due to the slightest
polysemy of the noun phrase, so only the first few senses
(see footnote 3), up to some threshold (4 or 5), are
considered. The hypernym trees are then searched in
decreasing frequency of sense for particular keywords
indicating semantic role; for instance, the hypernym "time
period", which can be derived from the noun "yesterday",
indicates a temporal relation. This search is done in such
a way that the most common semantic categories are given
priority.

3 In WordNet, this translates as being the few most frequent senses,
as determined through corpus-based training.
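
The override performed by the second layer can be sketched as follows. This Python fragment is only an illustration under stated assumptions: it replaces the WordNet lookup with a small hand-written hypernym table, and apart from the "time period" keyword for temporal relations, every keyword, table entry, and name in it is hypothetical.

```python
# Hypothetical stand-in for WordNet: for each head noun, one hypernym
# chain per sense, listed from the most to the least frequent sense.
TOY_HYPERNYMS = {
    "yesterday": [["day", "time period", "measure"]],
    "weeks": [["time period", "measure"]],
    "box": [["container", "artifact", "object"]],
}

# Keywords searched for in the chains, most common category first.
# Only "time period" -> TMP comes from the text; the rest are assumed.
KEYWORDS = [
    ("LOC", {"location", "container", "region"}),
    ("TMP", {"time period"}),
]

MAX_SENSES = 4  # threshold on the number of senses considered


def second_layer(head_noun, first_guess=None):
    """Override first_guess when a hypernym keyword identifies a category."""
    chains = TOY_HYPERNYMS.get(head_noun.lower(), [])[:MAX_SENSES]
    for chain in chains:                      # decreasing sense frequency
        for category, keywords in KEYWORDS:   # most common category first
            if keywords.intersection(chain):
                return category
    return first_guess                        # nothing found: keep layer 1


# "WITHIN three weeks": a locative first guess is overridden by TMP.
print(second_layer("weeks", first_guess="LOC"))
# Unknown head noun: the first-layer guess is kept.
print(second_layer("politics", first_guess="DIR"))
```

The nesting of the two loops reflects one reading of the search order described above (senses first, then categories by frequency); the opposite nesting would be an equally plausible reading.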

This approach is similar to (and inspired by!) approaches
using FrameNet as the semantic resource, and an evaluation
of this technique follows in Section 3.2.2.



2.3 PP Attachment 

Although prepositional phrases "... can appear within all
the other major phrase types" [MS02], the task of achieving
automatic PP-attachment syntactically has traditionally
required attachment either to a verb phrase or to a
noun phrase, historically accomplished by means of a
binary decision between two particular methods:

Right Association: A constituent tends to attach to
another immediately to its right. This approach
favours attachment to the noun [Kim73].

(ex. "I (VP saw (NP the dog (PP with its
puppies)))")

Minimal Attachment: A constituent tends to attach to
existing nonterminals using the fewest intermediate
nodes. This approach favours attachment
to the verb [Frz78].

(ex. "I (VP saw (NP the dog) (PP with my
binoculars))")



However, Whittemore et al. [Whi90] showed that neither
of these complementary tactics can account for more
than 55% of cases in general texts, and that each is a poor
predictor of how people resolve ambiguity [BR02]. The
first solution to this shortcoming is to involve statistics
and lexical knowledge. It has been shown in Brill and
Resnik (1994) [...]

[...] "I saw the fool on the hill
with the telescope", where there exist numerous
examples of ambiguity, but our concern is with the
ambiguity of PP-attachment. Does the prepositional phrase
"on the hill" refer to the location of the fool, or of
the speaker? Does the prepositional phrase "with the
telescope" attach to the verb "saw", or to either of the
nouns "the fool" or "the hill"?

Both of these suggested improvements to a syntactic
approach to PP-attachment, namely corpus-based statistical
learning and world knowledge (with inference), are
beyond the scope of this project.

2.4 Consideration of Semantics 

Determining semantics (Level 3) is beyond the scope of
this work at present due to time constraints. However,
we present a few general ideas on how it could possibly
be done. One (rather shallow) way is to build
subcategories of the 7 classes we have used. For example,
locative-type PPs may have lexical entries with semantic
relations, such as "part-to-whole", "betweenness", and
"relative distance" relations, which build up "path" and
"orientation" structures for "movement-directional" and
"stative-locational" interpretations according to [Nam95].
As with VPs, we will have to look at the (semantics of the)
argument structure of such PPs. In general, a hierarchy of
subcategories of the original classes of PPs will have to
be implemented. That would possibly require more than
two passes over the parses we currently have. Tests for
argument structure can be done as presented in [Ver].

3 Experiments & Analysis 

In order to gauge the effectiveness of our heuristics-based
approach, the system must be evaluated on corpora and
the results analyzed experimentally.
Such an experiment involves a 'training' phase where the
syntactic and semantic characteristics of the prepositional
phrases are analyzed and represented by heuristics, and a
'testing' phase where the efficacy of those heuristics is
measured statistically. The process is described in the
following subsections.

3.1 Experimental Setup 
3.1.1 Corpora 

For the purposes of the experiment, a collection of texts
from the Wall Street Journal (WSJ) corpus is chosen at
random to be representative of the available corpora, and
is divided into two sets. The first consists of 20
texts (~74%) used for 'training' the system; that is,
our heuristics are modified to best suit these documents.

891027-0018.txt   891027-0023.txt   891027-0038.txt   891027-0047.txt
891027-0056.txt   891027-0066.txt   891027-0082.txt   891027-0090.txt
891027-0101.txt   891027-0103.txt   891027-0114.txt   891027-0172.txt
891027-0181.txt   891030-0008.txt   891030-0011.txt   891030-0020.txt
891030-0028.txt   891030-0037.txt   891030-0066.txt   891020-0065.txt

Table 2: Training documents from WSJ corpus

891027-0007.txt   891027-0040.txt   891027-0081.txt   891027-0099.txt
891027-0111.txt   891030-0019.txt   891030-0093.txt

Table 3: Testing documents from WSJ corpus



The second set consists of 7 texts (~26%) used for
'testing' the system, from which our empirical measures
of performance are derived. The breakdown is shown in
Tables 2 and 3.

3.1.2 Human Annotation 

All 27 documents were hand-annotated for the occurrence
and grammatical structure of prepositional phrases. This
task was facilitated by two factors:

• Only prepositional phrases (and their constituents) 
were annotated, saving the annotators the trouble of 
doing full sentence parsing. 

• As a consequence of the first point, PP-attachment 
was not taken into consideration. 

Subsequently, the parses were augmented with the semantic
role annotations from Section 2.2.1. In cases where
the prepositional phrase did not fall into one of the 7
semantic categories, the tag was left simply as PP, without
additional markup.

The annotation of the 20 training documents guided the 
production of rules for grammatical expansion and heuris- 
tics for semantic role markup. The annotation of the 7 test 
documents allowed for numeric performance measures, 
and statistical evaluation. 

3.1.3 Measures of Performance 

Two principal measures of performance are precision and
recall, defined as follows [Qui85]:

Precision: $P = \frac{\#\ \text{of correct answers given by the system}}{\#\ \text{of answers given by the system}}$

Recall: $R = \frac{\#\ \text{of correct answers given by the system}}{\text{total}\ \#\ \text{of possible answers in the text}}$

For each of these measures, an 'answer' can be defined
with regard to the task at hand, and will be specified in
Section 3.2. For instance, when measuring the bare
PP-chunker grammar, recall would be defined as the
number of correctly identified prepositional phrases
divided by the number of all actual prepositional phrases
in a text.

An additional statistical measure is the F-measure,
which combines precision and recall, and is given by

$F = \frac{(\beta^2 + 1) P R}{\beta^2 P + R}$

where we consider $\beta = 1$ (equal weight to precision and
recall). Non-statistical measures of performance include
computational measures and the qualitative evaluation of
rules.
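
As a worked illustration of these measures, the following Python sketch computes precision, recall and the F-measure. The function names and the example counts are assumptions chosen for illustration; the final call uses the precision and recall percentages reported for the final PP-chunker in Section 3.2.1.

```python
def precision(correct, given):
    """Correct answers divided by all answers given by the system."""
    return correct / given if given else 0.0


def recall(correct, possible):
    """Correct answers divided by all possible answers in the text."""
    return correct / possible if possible else 0.0


def f_measure(p, r, beta=1.0):
    """F = (beta^2 + 1) * P * R / (beta^2 * P + R)."""
    denominator = beta ** 2 * p + r
    return (beta ** 2 + 1) * p * r / denominator if denominator else 0.0


# Hypothetical counts: 17 correct PPs out of 20 proposed, 25 in the text.
p = precision(17, 20)
r = recall(17, 25)
f = f_measure(p, r)
print(round(p, 3), round(r, 3), round(f, 3))

# Percentages reported for the final PP-chunker (P = 85.0%, R = 87.8%).
print(round(f_measure(0.850, 0.878), 3))
```

With beta = 1 the F-measure reduces to the harmonic mean of precision and recall, so it is always pulled towards the lower of the two values.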

3.2 Results & Analysis 

3.2.1 Evaluation of PP-grammar 

The discovery of minimal uncategorized PPs, which
comprises the first level of the system, is first analyzed
independently, since its performance will drastically affect
the performance of later stages.

The first pass of evaluation leads to a recall of
R ≈ 81.8% and a precision of P ≈ 73.8%. The
relatively poor precision is quickly discovered to be a
result of poor preprocessing of the WSJ texts, which results
in errors in tagging. Specifically, erroneous prepositional
phrases are found in headers of the WSJ texts and at the
ends of sentences, at points where punctuation causes
difficulty for Brill's Tagger, where errors are likely made
during the transformational tagging stage.

With additional preprocessing steps for the circumstances
listed, recall was R ≈ 81.8%
and precision improved to P ≈ 79.3%. Though this is a
significant improvement, additional improvements are
possible. Analysing the training corpus, we find that many
potential prepositional phrases will not parse due to poor
parsing of the more complex noun phrases which would
normally form the object of the PP. Furthermore, the rule
PP <= IN RB, which was originally included to deal
with phrases of the form "in particular", was
overgenerating many parses.

Figure 1: Precision and recall for evaluation of PP-chunker,
plotted against iteration.

Modifying the rules for NPs to take into consideration
aggregate NPs, and removing the rule PP <= IN RB,
results in a final recall of R ≈ 87.8% and a precision
of P ≈ 85.0%, which translates into an F-measure
of F ≈ 86.4%.

Evidently, relatively high precision and recall can be
achieved for prepositional phrases with relatively few
grammatical rules. This is in stark contrast to automatic
noun phrase or verb phrase chunkers, which traditionally
score lower and require more numerous and more
complex rules. This tends to validate the theory in the
literature which paints PPs as having a basic syntactic
structure, as described in Section 2.1. The improvements
made at each evaluation pass are summarized in Figure 1.

3.2.2 Evaluation of Automatic Categorization 

Utilizing lexical knowledge first, then subsuming it with
semantic knowledge, leads to very encouraging results.
Measuring the recall, precision, and F-measure as
previously defined over all 7 test documents, we break down
the analysis into the 4 most frequent categories, where
un-annotated PPs are counted towards the total positively if
they do not strictly fit into our seven categories, and
negatively otherwise (see footnote 4). This breakdown is
shown in Table 4.

                       Recall   Precision   F-measure
Locative PPs only      87.1%    89.5%       88.3%
Temporal PPs only      79.6%    81.3%       80.5%
Directional PPs only   86.1%    91.9%       89.0%
Manner PPs only        65.4%    60.7%       63.1%
Total                  79.4%    83.5%       81.5%

Table 4: Evaluation of Automatic Categorization

4 So total fractions do not necessarily equal the sum of their parts.

It is immediately recognized that our heuristics seem
exceptionally well-suited for directional PPs, and
exceptionally poorly suited to manner PPs. Such a discrepancy
cannot, of course, be attributed merely to the heuristics for
these two categories, since the categories are not handled
independently and the heuristics for different categories
interact significantly with one another.

It should be noted that the more potential categories are
taken into consideration, the more error we can expect,
due to the relatively higher rate of perplexity.

4 Future Work 

Most of the future work would be dedicated to Levels
2 and 3 outlined at the beginning, i.e. PP-attachment and
semantic analysis of PPs. Both problems are quite
difficult to solve and will require a good deal of research
and trial-and-error with existing and new proposals. New,
more robust tools will be required, such as stochastic
methods in NLP and a lot more training and testing data.
For example, we could assign probabilities to our grammar
rules and encode the attachment information into the
syntactic productions. Likewise, the detection of semantic
roles of PPs and their arguments can be done heuristically
and statistically with two or more passes.

5 Appendix

5.1 Appendix A: PP Rule Set

The rule set is quite small and simple, provided the rest of
the grammar has a clear notion of what NP and VP are.
This basic rule set is used for Level 0 of our work. It
gives us a list of unclassified PPs from a corpus. Later on,
a Perl script is run to augment this information with the 7
categories outlined in Section 2.2.1, keeping PP as a general
classification if we are unsure what kind of PP we are
looking at. This second pass of annotation provides the
classification information of Level 1 of our work. The entire
Level 0 grammar is in pp-chunker.scm, presented in
Appendix B. The Level 1 heuristic rules are presented in
Appendix C, in the augment-pp.pl Perl script.
The following are the sample rules used:

(PP (IN NP)      ; "on new rules", "for him", "by Doug"
    (IN IN NP)   ; "because of the rain"
    (PP PP)      ; "on new rules for covert operations"
                 ; (NB. we don't care about attachment)
    (TO NP)      ; "to committee officials"
    (IN NP VP)   ; "of the top dog running the show"
                 ; (NB. spurious, but not harmful)
)

5.2 Appendix B: PP Chunker (pp-chunker.scm)

See [pp-chunker.scm]

5.3 Appendix C: PP Augmenter Script 
(augment-pp.pl) 

See [augment-pp.pl] 